Chinese Dependency Parsing with Large Scale Automatically Constructed Case Structures
نویسندگان
چکیده
This paper proposes an approach using large scale case structures, which are automatically constructed from both a small tagged corpus and a large raw corpus, to improve Chinese dependency parsing. The case structure proposed in this paper has two characteristics: (1) it relaxes the predicate of a case structure to be all types of words which behaves as a head; (2) it is not categorized by semantic roles but marked by the neighboring modifiers attached to a head. Experimental results based on Penn Chinese Treebank show the proposed approach achieved 87.26% on unlabeled attachment score, which significantly outperformed the baseline parser without using case structures.
منابع مشابه
Treebank-Based Acquisition of Chinese LFG Resources for Parsing and Generation
This thesis describes a treebank-based approach to automatically acquire robust, wide-coverage Lexical-Functional Grammar (LFG) resources for Chinese parsing and generation, which is part of a larger project on the rapid construction of deep, large-scale, constraint-based, multilingual grammatical resources. I present an application-oriented LFG analysis for Chinese core linguistic phenomena an...
متن کاملCharacter-Level Chinese Dependency Parsing
Recent work on Chinese analysis has led to large-scale annotations of the internal structures of words, enabling characterlevel analysis of Chinese syntactic structures. In this paper, we investigate the problem of character-level Chinese dependency parsing, building dependency trees over characters. Character-level information can benefit downstream applications by offering flexible granularit...
متن کاملAn improved joint model: POS tagging and dependency parsing
Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
متن کاملTreebank-Based Acquisition of LFG Resources for Chinese
This paper presents a method to automatically acquire wide-coverage, robust, probabilistic Lexical-Functional Grammar resources for Chinese from the Penn Chinese Treebank (CTB). Our starting point is the earlier, proofof-concept work of (Burke et al., 2004) on automatic f-structure annotation, LFG grammar acquisition and parsing for Chinese using the CTB version 2 (CTB2). We substantially exten...
متن کاملTreebank-Based Acquisition of LFG Parsing Resources for French
Motivated by the expense in time and other resources to produce hand-crafted grammars, there has been increased interest in automatically obtained wide-coverage grammars from treebanks for natural language processing. In particular, recent years have seen the growth in interest in automatically obtained deep resources that can represent information absent from simple CFG-type structured treeban...
متن کامل